An introduction to Linear Mixed Models
ABG-Iowa State University
Understand the basic linear mixed model
Basic model equation
Random or Fixed?
Mixed Model Equations (MME)
BLUP and its properties
Apply matrix functions in R/Julia to solve the MME
Apply R/Julia functions to fit linear mixed effects models
\[
\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e}
\] where, \(\boldsymbol{y}\) is the vector of observations
\(\boldsymbol{X}\) is the incidence matrix for fixed effects
\(\boldsymbol{\beta}\) is the vector of fixed effects
\(\boldsymbol{Z}\) is the incidence matrix for random effects
\(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals
Let’s take some time to discuss the difference between random and fixed effects
If: \(\boldsymbol{x} \sim N(\boldsymbol{\mu},\boldsymbol{S})\), and \(\boldsymbol{y}=\boldsymbol{K}\boldsymbol{x}\), then: \[ \boldsymbol{y}\sim N(\boldsymbol{K}\boldsymbol{\mu},\boldsymbol{K}\boldsymbol{S}\boldsymbol{K}^{'}) \] Conditional Distributions, if: \[ \begin{bmatrix} \boldsymbol{x} \\ \boldsymbol{z} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{\mu} \\ \boldsymbol{\nu} \end{bmatrix}, \begin{bmatrix} \boldsymbol{S} & \boldsymbol{C} \\ \boldsymbol{C} ^{'}& \boldsymbol{W} \end{bmatrix} \right) \]
Then: \[ \boldsymbol{x}|\boldsymbol{z}\sim N(\boldsymbol{\mu}+\boldsymbol{C}\boldsymbol{W}^{-1}(\boldsymbol{z}-\boldsymbol{\nu}),\boldsymbol{S}-\boldsymbol{C}\boldsymbol{W}^{-1}\boldsymbol{C}^{'}) \]
The model assumptions are summarized in the following equation: \[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{R} \end{bmatrix} \right) \]
Which implies: \[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \\ \boldsymbol{y} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \\ \boldsymbol{X}\boldsymbol{\beta} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0}& \boldsymbol{ZG} \\ \boldsymbol{0} & \boldsymbol{R} & \boldsymbol{R} \\ \boldsymbol{GZ}^{'} & \boldsymbol{R} & \boldsymbol{ZGZ}^{'}+\boldsymbol{R} \end{bmatrix} \right) \] Let’s show a proof of this in class.
The weights at birth of 6 pigs are presented together with their sex and their dam. \[ \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}, \boldsymbol{X}_1= \begin{bmatrix}1&0 \\0&1 \\0&1 \\0&1 \\0&1 \\0&1 \\\end{bmatrix}, \boldsymbol{X}_2= \begin{bmatrix}1&0 \\1&0 \\1&0 \\0&1 \\0&1 \\0&1 \\\end{bmatrix} \] Which of these two should be considered random and why?
\[ \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}, \boldsymbol{X}= \begin{bmatrix}1&0 \\0&1 \\0&1 \\0&1 \\0&1 \\0&1 \\\end{bmatrix}, \boldsymbol{Z}= \begin{bmatrix}1&0 \\1&0 \\1&0 \\0&1 \\0&1 \\0&1 \\\end{bmatrix} \]
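As a quick illustration (sketched here in Python/NumPy purely for convenience; the same matrix operations exist in R and Julia, which we use in this course), the data above can be entered as:

```python
import numpy as np

# Birth weights of the 6 piglets
y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
# Incidence matrix for the fixed effect of sex (columns: F, M)
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
# Incidence matrix for the random effect of dam (columns: dam 474, dam 475)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

print(X.T @ X)  # counts per sex on the diagonal
print(Z.T @ y)  # sums of weights per dam: 4.66 and 4.40
```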
If dam is considered random, we need to make an assumption about its variance, and about the residual variance too.
\[ var\left( \begin{bmatrix} u_{474} \\ u_{475} \end{bmatrix} \right)=\boldsymbol{G}=\boldsymbol{I} \times 0.016; var(\boldsymbol{e})=\boldsymbol{R}=\boldsymbol{I} \times 0.08 \] Notice: \(\sigma^{2}_u= 0.016\), and \(\sigma^{2}_e= 0.08\) What other plausible assumptions can we make? (not about the value of the variances, but about the structure of G and R)
How would these assumptions affect the final results?
How about changing \(\sigma^{2}_u\), and \(\sigma^{2}_e\)?
Dr. Charles Roy Henderson (a graduate student with Hazel and Lush at ISU) proposed a method to solve this in his PhD thesis in the late 1940s. He later revisited the computation, proposing an easier solution.
\[
\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e}
\] with: \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals
Find:
BLUE: Linear estimate of \(\boldsymbol{\beta}\), \(\boldsymbol{\hat{\beta}}\) such that: \(E(\boldsymbol{\hat{\beta}})=\boldsymbol{\beta}\) (i.e: unbiased) and \(Var(\boldsymbol{\hat{\beta}})\) is the minimum among all unbiased estimates (best).
BLUP: Linear predictor of \(\boldsymbol{u}\), \(\boldsymbol{\hat{u}}\) such that: \(E(\boldsymbol{\hat{u}}-\boldsymbol{u})=\boldsymbol{0}\) (i.e: unbiased) and \(Var(\boldsymbol{\hat{u}}-\boldsymbol{u})\) is the minimum among all unbiased estimates (best).
Notice how the definition of unbiased is different for random and fixed effects
The mixed model equations are a simple and elegant way of obtaining BLUE and BLUP.
\[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z}+\boldsymbol{G}^{-1} \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} \]
Let’s discuss this in class.
How to solve this system of equations?
It’s common that \(\boldsymbol{Z}\) is not full rank. Why can we still solve it?
What is the difference between treating an effect as fixed or random in terms of this system of equations?
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with
\(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\): the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\): the vector of random residuals
The MME can be re-written as: \[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]
with \(\lambda=\sigma^2_e/\sigma^2_u\)
\[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]
\[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]
All that matters is \(\lambda\)!
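We can check this claim numerically. A NumPy sketch (equivalent R/Julia code is a direct translation): multiplying both variance components by 10 leaves \(\lambda\), and therefore the solutions, unchanged.

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

def solve_mme(s2u, s2e):
    """Solve the full MME with R = I*s2e and G = A*s2u (here A = I)."""
    Ri = np.eye(len(y)) / s2e
    Gi = np.eye(Z.shape[1]) / s2u
    lhs = np.block([[X.T @ Ri @ X, X.T @ Ri @ Z],
                    [Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi]])
    rhs = np.concatenate([X.T @ Ri @ y, Z.T @ Ri @ y])
    return np.linalg.solve(lhs, rhs)

# Both settings have lambda = 5, so BLUE and BLUP are identical
print(np.allclose(solve_mme(0.016, 0.08), solve_mme(0.16, 0.80)))  # True
```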
For the example of pig birth weights:
\[ \boldsymbol{X}^{'}\boldsymbol{X}=\begin{bmatrix}1&0 \\0&5 \\\end{bmatrix}, \boldsymbol{X}^{'}\boldsymbol{Z}= \begin{bmatrix}1&0 \\2&3 \\\end{bmatrix}, \boldsymbol{Z}^{'}\boldsymbol{Z}=\begin{bmatrix}3&0 \\0&3 \\\end{bmatrix} ,\boldsymbol{X}^{'}\boldsymbol{y}=\begin{bmatrix}1.72 \\7.34 \\\end{bmatrix} ,\boldsymbol{Z}^{'}\boldsymbol{y}=\begin{bmatrix}4.66 \\4.4 \\\end{bmatrix} \] \[ var\left( \begin{bmatrix} u_{474} \\ u_{475} \end{bmatrix} \right)=\boldsymbol{G}=\boldsymbol{I} \times 0.016; var(\boldsymbol{e})=\boldsymbol{R}=\boldsymbol{I} \times 0.08; \lambda=5.0 \] \[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \] \[ \begin{bmatrix}1&0&1&0 \\0&5&2&3 \\1&2&8&0 \\0&3&0&8 \\\end{bmatrix}^{-1} \times \begin{bmatrix}1.72 \\7.34 \\4.66 \\4.4 \\\end{bmatrix} = \begin{bmatrix}1.719 \\1.468 \\0.001 \\-0.001 \\\end{bmatrix} \] Let’s study the effect of \(\lambda\) on the BLUE and BLUP
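The solution above can be reproduced with a few matrix calls. A NumPy sketch (the R/Julia equivalents map one-to-one):

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

lam = 0.08 / 0.016            # lambda = sigma2_e / sigma2_u = 5
A_inv = np.eye(2)             # A = I in this example
lhs = np.block([[X.T @ X, X.T @ Z],
                [Z.T @ X, Z.T @ Z + A_inv * lam]])
rhs = np.concatenate([X.T @ y, Z.T @ y])
sol = np.linalg.solve(lhs, rhs)
print(sol.round(3))           # matches the solution above
```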
Estimates of the sex (fixed) and dam (random) effects, together with the differences sexF\(-\)sexM (dif_sex) and dam474\(-\)dam475 (dif_dam), for different values of the variance ratio \[ \begin{bmatrix}lambda&sexF&sexM&dam474&dam475&dif_sex&dif_dam \\1e-04&1.7183&1.4683&0.0017&-0.0017&0.25&0.0033 \\0.05&1.7184&1.4683&0.0016&-0.0016&0.25&0.0033 \\0.1&1.7184&1.4683&0.0016&-0.0016&0.2501&0.0032 \\0.3&1.7185&1.4683&0.0015&-0.0015&0.2502&0.003 \\0.5&1.7186&1.4683&0.0014&-0.0014&0.2503&0.0028 \\0.75&1.7187&1.4683&0.0013&-0.0013&0.2505&0.0025 \\1&1.7188&1.4682&0.0012&-0.0012&0.2506&0.0024 \\5&1.7195&1.4681&5e-04&-5e-04&0.2514&0.0011 \\10&1.7197&1.4681&3e-04&-3e-04&0.2516&6e-04 \\20&1.7198&1.468&2e-04&-2e-04&0.2518&4e-04 \\\end{bmatrix} \]
As the value of \(\lambda\) increases, the predicted random effects are squeezed towards zero. In the extreme case in which the variance ratio is zero, the obtained predictions are the OLS solutions, and when the variance ratio is very large, the predicted values are zero.
\[ \begin{bmatrix}lambda&sexF&sexM&dam474&dam475&dif_sex&dif_dam \\1e-04&1.7183&1.4683&0.0017&-0.0017&0.25&0.0033 \\20&1.7198&1.468&2e-04&-2e-04&0.2518&4e-04 \\\end{bmatrix} \] This shrinkage is a well-known property of the prediction of random effects.
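A numerical sketch of the two extremes (NumPy, for illustration): with \(\lambda\) near zero the predictions approach the OLS solutions, and with \(\lambda\) very large they are squeezed to zero.

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

def solve_for(lam):
    """Solve the lambda-form MME for a given variance ratio."""
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + np.eye(2) * lam]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    return np.linalg.solve(lhs, rhs)

u_small = solve_for(1e-4)[2:]   # close to the OLS solutions
u_large = solve_for(1e6)[2:]    # squeezed towards zero
print(np.abs(u_small), np.abs(u_large))
```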
Rewrite the model as: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\epsilon} \]
with: \[ \boldsymbol{\epsilon} \sim N(\boldsymbol{0},\boldsymbol{V}) \] and: \[ \boldsymbol{V}=\boldsymbol{ZGZ}^{'}+\boldsymbol{R} \] The GLS solutions are also BLUE of \(\boldsymbol{\beta}\): \[ \boldsymbol{\hat{\beta}}=(\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X})^{-1}\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{y} \]
\[ \boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X}=\begin{bmatrix}10.9375&-3.125 \\-3.125&42.1875 \\\end{bmatrix}, (\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X})^{-1}=\begin{bmatrix}0.0934&0.0069 \\0.0069&0.0242 \\\end{bmatrix} \]
\[ \boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{y}=\begin{bmatrix}14.22 \\56.56 \\\end{bmatrix}, \boldsymbol{\hat{\beta}}=\begin{bmatrix}1.7195 \\1.4681 \\\end{bmatrix} \]
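These GLS numbers can be reproduced directly (NumPy sketch; the course software is R/Julia):

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

G = 0.016 * np.eye(2)
R = 0.08 * np.eye(6)
V = Z @ G @ Z.T + R                               # V = ZGZ' + R
Vi = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)  # GLS = BLUE
print(beta.round(4))                                # [1.7195 1.4681]
```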
These results are identical to those obtained in slide 11. This also offers a way, for instance, to perform hypothesis testing for fixed effects while accounting for the covariance between observations.
\[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \\ \boldsymbol{y} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \\ \boldsymbol{X}\boldsymbol{\beta} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0}& \boldsymbol{ZG} \\ \boldsymbol{0} & \boldsymbol{R} & \boldsymbol{R} \\ \boldsymbol{GZ}^{'} & \boldsymbol{R} & \boldsymbol{V}=\boldsymbol{ZGZ}^{'}+\boldsymbol{R} \end{bmatrix} \right) \] obtain \(E(\boldsymbol{u}|\boldsymbol{y})\): \[ E(\boldsymbol{u}|\boldsymbol{y})=E(\boldsymbol{u})+cov(\boldsymbol{u},\boldsymbol{y})var(\boldsymbol{y})^{-1}(\boldsymbol{y}-E(\boldsymbol{y}))=\boldsymbol{GZ^{'}}\boldsymbol{V}^{-1}(\boldsymbol{y}-\boldsymbol{X\hat{\beta}}) \]
Example: \[ \boldsymbol{Z}^{'}=\begin{bmatrix}1&1&1&0&0&0 \\0&0&0&1&1&1 \\\end{bmatrix}; \boldsymbol{G}=\begin{bmatrix}0.016&0 \\0&0.016 \\\end{bmatrix}; \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}; \boldsymbol{X\hat{\beta}}=\begin{bmatrix}1.72 \\1.47 \\1.47 \\1.47 \\1.47 \\1.47 \\\end{bmatrix} \]
\[ \boldsymbol{V }^{-1}=\begin{bmatrix}10.94&-1.56&-1.56&0&0&0 \\-1.56&10.94&-1.56&0&0&0 \\-1.56&-1.56&10.94&0&0&0 \\0&0&0&10.94&-1.56&-1.56 \\0&0&0&-1.56&10.94&-1.56 \\0&0&0&-1.56&-1.56&10.94 \\\end{bmatrix} \] \[ \boldsymbol{GZ^{'}}\boldsymbol{V}^{-1}(\boldsymbol{y}-\boldsymbol{X\hat{\beta}})= \] \[ \begin{bmatrix}0.12&0.12&0.12&0&0&0 \\0&0&0&0.12&0.12&0.12 \\\end{bmatrix} \times \left( \begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}-\begin{bmatrix}1.72 \\1.47 \\1.47 \\1.47 \\1.47 \\1.47 \\\end{bmatrix} \right) =\begin{bmatrix}5e-04 \\-5e-04 \\\end{bmatrix} \]
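The same computation as a NumPy sketch, evaluating \(E(\boldsymbol{u}|\boldsymbol{y})\) at the GLS estimate \(\boldsymbol{\hat{\beta}}\):

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

G = 0.016 * np.eye(2)
V = Z @ G @ Z.T + 0.08 * np.eye(6)
Vi = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)  # BLUE (GLS)
u = G @ Z.T @ Vi @ (y - X @ beta)                   # BLUP via E(u|y)
print(u)                                            # approx [5e-04, -5e-04]
```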
A model of this form: \[
\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e}
\] where, \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals
Let’s apply the Cholesky decomposition of \(\boldsymbol{G}\) by obtaining a lower triangular matrix \(\boldsymbol{T}\) such that: \[
\boldsymbol{G}=\boldsymbol{T}\boldsymbol{T}^{'}
\] Then, replace \(\boldsymbol{u}\) with \(\boldsymbol{u}=\boldsymbol{T}\boldsymbol{u}^{*}\): \[
\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{T}\boldsymbol{u}^{*}+\boldsymbol{e}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}^{*}\boldsymbol{u}^{*}+\boldsymbol{e}
\] and \(\boldsymbol{u}^{*} \sim N(\boldsymbol{0},\boldsymbol{I})\)
The two models are equivalent in the sense that the observations have the same distribution.
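A quick numerical check of the equivalence (NumPy sketch; the \(\boldsymbol{G}\) and \(\boldsymbol{Z}\) below are arbitrary made-up matrices, not the pig example):

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(6, 3))
M = rng.normal(size=(3, 3))
G = M @ M.T + np.eye(3)          # an arbitrary positive-definite G

T = np.linalg.cholesky(G)        # lower triangular, G = T T'
Z_star = Z @ T                   # re-computed incidence matrix

# Both models imply the same var(y): Z G Z' equals Z* I Z*'
print(np.allclose(Z @ G @ Z.T, Z_star @ Z_star.T))   # True
```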
The equivalency between this: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with: \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\)
And this: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}^{*}\boldsymbol{u}^{*}+\boldsymbol{e} \] with \(\boldsymbol{u}^{*} \sim N(\boldsymbol{0},\boldsymbol{I})\)
Means that if we can fit independent random effects models, then we can fit any linear mixed model because we can use the “Cholesky trick” to re-compute the incidence matrix and fit an equivalent model.
Note: This may be computationally inefficient!
We saw: \[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z}+\boldsymbol{G}^{-1} \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} = \begin{bmatrix} \boldsymbol{C}_{XX} & \boldsymbol{C}_{XZ} \\ \boldsymbol{C}_{ZX} & \boldsymbol{C}_{ZZ} \end{bmatrix} \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} \] The elements \(\boldsymbol{C}_{XX}\), \(\boldsymbol{C}_{XZ}\), \(\boldsymbol{C}_{ZZ}\) are key matrices in LMM as they are used to derive variances of predictors:
\[ Var(\boldsymbol{\hat{\beta}})=\boldsymbol{C}_{XX} \] and \[ PEV=var(\boldsymbol{\hat{u}}-\boldsymbol{u})=\boldsymbol{C}_{ZZ} \] and
\[ var(\boldsymbol{\hat{u}})=\boldsymbol{G}-\boldsymbol{C}_{ZZ} \] Now, let’s discuss asymptotic values in class.
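These identities can be verified numerically for the pig example. A NumPy sketch, using the fact (from the GLS section) that \(Var(\boldsymbol{\hat{\beta}})=\boldsymbol{C}_{XX}=(\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X})^{-1}\):

```python
import numpy as np

y = np.array([1.72, 1.45, 1.49, 1.36, 1.54, 1.50])
X = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
G = 0.016 * np.eye(2)
R = 0.08 * np.eye(6)

Ri, Gi = np.linalg.inv(R), np.linalg.inv(G)
lhs = np.block([[X.T @ Ri @ X, X.T @ Ri @ Z],
                [Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi]])
C = np.linalg.inv(lhs)           # inverse of the MME coefficient matrix
C_XX, C_ZZ = C[:2, :2], C[2:, 2:]

# C_XX equals the GLS variance (X' V^-1 X)^-1, with V = ZGZ' + R
V = Z @ G @ Z.T + R
print(np.allclose(C_XX, np.linalg.inv(X.T @ np.linalg.inv(V) @ X)))  # True
print(G - C_ZZ)                  # var(u-hat) = G - PEV
```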
Now, we can describe in more detail the content of this course, presenting a general version of the mixed model shown before and specializing it to a few particular cases:
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\sum_{k=1}^{K}{\boldsymbol{Z}_k\boldsymbol{u}_k}+\boldsymbol{e} \] with: \(\boldsymbol{u}_k \sim N(\boldsymbol{0},\boldsymbol{G}_k)\).
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).
where: \(\boldsymbol{u}\) represents the vector of sire effects and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal (\(\boldsymbol{y}\)) to its corresponding random sire effect.
Also, \(\boldsymbol{A}\) represents a relationship matrix between sires and \(\sigma^2_u\) is the sire variance.
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).
where: \(\boldsymbol{u}\) represents the vector of animal effects and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal (\(\boldsymbol{y}\)) to its corresponding “breeding value”.
Also, \(\boldsymbol{A}\) represents a relationship matrix and \(\sigma^2_u\) is the genetic additive variance.
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).
where: \(\boldsymbol{u}\) represents a vector of random individual regression coefficients and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal at a specific time point to that animal’s regression coefficients.
Also, \(\boldsymbol{G}\) will have a special structure to account for covariances between animals and between coefficients within animals.
\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\sum_{k=1}^{K}{\boldsymbol{Z}_k\boldsymbol{u}_k}+\boldsymbol{e} \] Sometimes, like in maternal effects models and in social genetic effects models, some random effects will have zeroes in the diagonal of the corresponding \(\boldsymbol{Z}_k\) because the corresponding random effect does not affect the individual’s own record, but the record of an animal that interacts with it, e.g., its mother.
These indirect genetic effect models are used in many genetic evaluations.
In the first half of the semester we assume that variance components are known; in the last part of the semester we will learn methods to estimate them.
But here is the catch with unknown variance components, when they are estimated from the same data from which the BLUPs of \(\boldsymbol{u}\) are obtained:
Best: It’s no longer guaranteed to be the minimum-variance solution
Linear: They are no longer linear functions of the data, because the variances are now estimated from the data
Unbiased: The most commonly used methods to estimate variance components give biased results, so…
Predictor: It’s still A predictor 8^D
In fact, even with estimated variance components, and with reasonably large datasets relative to the model size, the departure from these properties will likely be minimal.